Towards Unrestricted Lip Reading

نویسندگان

  • Uwe Meier
  • Rainer Stiefelhagen
  • Jie Yang
  • Alexander H. Waibel
چکیده

Lip reading provides useful information in speech perception and language understanding, especially when the auditory speech is degraded. However, many current automatic lip reading systems impose some restrictions on users. In this paper, we present our research e orts, in the Interactive System Laboratory, towards unrestricted lip reading. We rst introduce a top-down approach to automatically track and extract lip regions. This technique makes it possible to acquire visual information in real-time without limiting user's freedom of movement. We then discuss normalization algorithms to preprocess images for di erent lightning conditions (global illumination and side illumination). We also compare di erent visual preprocessing methods such as raw image, Linear Discriminant Analysis (LDA), and Principle Component Analysis (PCA). We demonstrate the feasibility of the proposed methods by development of a modular system for exible human-computer interaction via both visual and acoustic speech. The system is based on an extension of an existing state-of-the-art speech recognition system, a modular Multiple State-Time Delayed Neural Network (MS-TDNN) system. We have developed adaptive combination methods at several di erent levels of the recognition network. The system can automatically track a speaker and extract his/her lip region in real-time. The system has been evaluated under di erent noisy conditions such as white noise, music, and mechanical noise. The experimental results indicate that the system can achieve up to 55% error reduction using additional visual information.

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

منابع مشابه

Lip Tracking Towards an Automatic Lip Reading Approach

Current era is to make the interaction between humans and their artificial partners (Computers) and make communication easier and more reliable. One of the actual tasks is the use of vocal interaction. Speech recognition may be improved by visual information of human face. In literature, the lip shape and its movement are referred to as lip reading. Lip reading computing plays a vital role in a...

متن کامل

Designing and implementing a system for Automatic recognition of Persian letters by Lip-reading using image processing methods

For many years, speech has been the most natural and efficient means of information exchange for human beings. With the advancement of technology and the prevalence of computer usage, the design and production of speech recognition systems have been considered by researchers. Among this, lip-reading techniques encountered with many challenges for speech recognition, that one of the challenges b...

متن کامل

لب‌خوانی و ادراک گفتار دانش‌آموزان کم‌شنوای مدارس ویژۀ کم‌شنوایان در شهر تهران

Objective: The goal of this study was to evaluate the lip reading ability and Speech perception of hearing impaired students of special schools for the hearing impaired in different speech levels. Materials & Methods: In this cross- sectional study, 44 deaf students (9-12 years old) were selected with multi-stage cluster sampling method, from two special schools for the deaf in Tehran. Tools...

متن کامل

Limitations of visual speech recognition

In this paper we investigate the limits of automated lip-reading systems and we consider the improvement that could be gained were additional information from other (non-visible) speech articulators available to the recogniser. Hidden Markov model (HMM) speech recognisers are trained using electromagnetic articulography (EMA) data drawn from the MOCHA-TIMIT data set. Articulatory information is...

متن کامل

Dictionary-Based Lip Reading Classification

Visual lip reading recognition is an essential stage in many multimedia systems such as “Audio Visual Speech Recognition” [6], “Mobile Phone Visual System for deaf people”, “Sign Language Recognition System”, etc. The use of lip visual features to help audio or hand recognition is appropriate because this information is robust to acoustic noise. In this paper, we describe our work towards devel...

متن کامل

ذخیره در منابع من


  با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

عنوان ژورنال:
  • IJPRAI

دوره 14  شماره 

صفحات  -

تاریخ انتشار 2000